Discovering Informative Patterns and Data Cleaning
نویسندگان
چکیده
We present a method for discovering informative patterns from data. With this method, large databases can be reduced to only a few representative data entries. Our framework encompasses also methods for cleaning databases containing corrupted data. Both on-line and off-line algorithms are proposed and experimentally checked on databases of handwritten images. The generality of the framework makes it an attractive candidate for new applications in knowledge discovery. Keywordsknowledge discovery, machine learning, informative patterns, data cleaning, information gain.
منابع مشابه
General Frameworks for Combined Mining: Discovering Informative Knowledge in Complex Data
Enterprise data mining applications such as mining government service data often involve multiple large heterogeneous data sources, user preferences and business impact. Business people expect data mining deliverables to inform direct business decision-making actions. In such situations, a single method or one-step mining is often limited in discovering informative knowledge. It would also be v...
متن کاملIdentifying Informative Web Content Blocks using Web Page Segmentation
Information Extraction has become an important task for discovering useful knowledge or information from the Web. A crawler system, which gathers the information from the Web, is one of the fundamental necessities of Information Extraction. A search engine uses a crawler to crawl and index web pages. Search engine takes into account only the informative content for indexing. In addition to info...
متن کاملA Survey on Preprocessing Methods for Web Usage Data
World Wide Web is a huge repository of web pages and links. It provides abundance of information for the Internet users. The growth of web is tremendous as approximately one million pages are added daily. Users’ accesses are recorded in web logs. Because of the tremendous usage of web, the web log files are growing at a faster rate and the size is becoming huge. Web data mining is the applicati...
متن کاملEffective web log mining and online navigational pattern prediction
The web has become the world's largest repository of knowledge. Web usage mining is the process of discovering knowledge from the interactions generated by the user in the form of access logs, cookies, and user sessions data. Web Mining consists of three different categories, namely Web Content Mining, Web Structure Mining, and Web Usage Mining (is the process of discovering knowledge from the ...
متن کاملCombined Mining Approach to Generate Patterns for Complex Data
In Data mining applications, which often involve complex data like multiple heterogeneous data sources, user preferences, decision-making actions and business impacts etc., the complete useful information cannot be obtained by using single data mining method in the form of informative patterns as that would consume more time and space, if and only if it is possible to join large relevant data s...
متن کامل